A Linearly-Convergent Stochastic L-BFGS Algorithm
Abstract
We propose a new stochastic L-BFGS algorithm and prove a linear convergence rate for strongly convex and smooth functions. Our algorithm draws heavily on a recent stochastic variant of L-BFGS proposed by Byrd et al. (2014) and on a recent approach to variance reduction for stochastic gradient descent due to Johnson and Zhang (2013). We demonstrate experimentally that our algorithm performs well on large-scale convex and non-convex optimization problems, exhibiting linear convergence and rapidly solving the optimization problems to high levels of precision. Furthermore, we show that our algorithm performs well for a wide range of step sizes, often differing by several orders of magnitude.
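The combination described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a ridge-regression objective, a variance-reduced gradient in the style of Johnson and Zhang (2013), and curvature pairs built from subsampled Hessian-vector products in the style of Byrd et al. (2014). The function names (`two_loop`, `svrg_lbfgs`, `demo`), all default parameter values, and the choice to start each epoch from the last iterate are assumptions made for illustration.

```python
import numpy as np

def two_loop(grad, pairs):
    """Standard L-BFGS two-loop recursion: applies the implicit
    inverse-Hessian estimate built from (s, y) pairs to grad."""
    q = grad.copy()
    alphas = []
    for s, y in reversed(pairs):           # newest pair first
        a = (s @ q) / (y @ s)
        alphas.append(a)
        q -= a * y
    if pairs:                              # initial scaling from the newest pair
        s, y = pairs[-1]
        q *= (s @ y) / (y @ y)
    for (s, y), a in zip(pairs, reversed(alphas)):  # oldest pair first
        b = (y @ q) / (y @ s)
        q += (a - b) * s
    return q

def svrg_lbfgs(X, t, lam=1e-2, eta=0.1, epochs=30, batch=20,
               hess_batch=200, L=10, memory=10, seed=0):
    """Sketch: minimize (1/2n)||Xw - t||^2 + (lam/2)||w||^2.
    All defaults are illustrative assumptions, not values from the paper."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    m = n // batch                         # inner steps per epoch (assumed)

    def grad(w, idx):
        Xi = X[idx]
        return Xi.T @ (Xi @ w - t[idx]) / len(idx) + lam * w

    w = np.zeros(d)
    pairs, u_prev, u_sum = [], None, np.zeros(d)
    for _ in range(epochs):
        w_snap = w.copy()
        mu = grad(w_snap, np.arange(n))    # full gradient at the snapshot
        for k in range(1, m + 1):
            idx = rng.choice(n, batch, replace=False)
            # Variance-reduced stochastic gradient
            v = grad(w, idx) - grad(w_snap, idx) + mu
            u_sum += w
            w = w - eta * two_loop(v, pairs)
            if k % L == 0:                 # refresh curvature every L steps
                u = u_sum / L              # average of the last L iterates
                u_sum = np.zeros(d)
                if u_prev is not None:
                    s = u - u_prev
                    h = rng.choice(n, hess_batch, replace=False)
                    # Subsampled Hessian-vector product (exact for ridge)
                    y = X[h].T @ (X[h] @ s) / hess_batch + lam * s
                    if s @ y > 1e-10 * (s @ s):   # keep only positive curvature
                        pairs = (pairs + [(s, y)])[-memory:]
                u_prev = u
    return w

def demo(seed=0):
    """Distance to the closed-form ridge solution on synthetic data."""
    rng = np.random.default_rng(seed)
    n, d, lam = 2000, 10, 1e-2
    X = rng.standard_normal((n, d))
    t = X @ rng.standard_normal(d) + 0.1 * rng.standard_normal(n)
    w = svrg_lbfgs(X, t, lam=lam)
    w_star = np.linalg.solve(X.T @ X / n + lam * np.eye(d), X.T @ t / n)
    return np.linalg.norm(w - w_star)
```

Calling `demo()` returns the Euclidean distance between the sketch's output and the closed-form ridge solution on synthetic data. The subsampled Hessian-vector product is used here because it guarantees positive curvature for a strongly convex objective, which keeps the implicit inverse-Hessian estimate positive definite.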
Similar resources
Randomized Quasi-Newton Updates Are Linearly Convergent Matrix Inversion Algorithms
We develop and analyze a broad family of stochastic/randomized algorithms for inverting a matrix. We also develop specialized variants maintaining symmetry or positive definiteness of the iterates. All methods in the family converge globally and linearly (i.e., the error decays exponentially), with explicit rates. In special cases, we obtain stochastic block variants of several quasi-Newton upda...
Stochastic L-BFGS Revisited: Improved Convergence Rates and Practical Acceleration Strategies
We revisit the stochastic limited-memory BFGS (L-BFGS) algorithm. By proposing a new framework for analyzing convergence, we theoretically improve the (linear) convergence rates and computational complexities of the stochastic L-BFGS algorithms in previous works. In addition, we propose several practical acceleration strategies to speed up the empirical performance of such algorithms. We also pr...
On the superlinear convergence of the variable metric proximal point algorithm using Broyden and BFGS matrix secant updating
In previous work, the authors provided a foundation for the theory of variable metric proximal point algorithms in Hilbert space. In that work, conditions are developed for global, linear, and superlinear convergence. This paper focuses attention on two matrix secant updating strategies for the finite dimensional case. These are the Broyden and BFGS updates. The BFGS update is considered for ap...
New Quasi-Newton Optimization Methods for Machine Learning
This thesis develops new quasi-Newton optimization methods that exploit the well-structured functional form of objective functions often encountered in machine learning, while still maintaining the solid foundation of the standard BFGS quasi-Newton method. In particular, our algorithms are tailored for two categories of machine learning problems: (1) regularized risk minimization problems with c...
A Progressive Batching L-BFGS Method for Machine Learning
The standard L-BFGS method relies on gradient approximations that are not dominated by noise, so that search directions are descent directions, the line search is reliable, and quasi-Newton updating yields useful quadratic models of the objective function. All of this appears to call for a full batch approach, but since small batch sizes give rise to faster algorithms with better generalization...